NAME: VIVIAN KERUBO MOSOMI

REGISTRATION NUMBER: SCT212-0062/2021

## **E1**

### Problem

Assume we have a computer where the CPI is 1.0 when all memory accesses (including data and instruction accesses) hit in the cache. The cache is a unified (data + instruction) cache of size 256 KB, 4-way set associative, with a block size of 64 bytes. The data accesses (loads and stores) constitute 50% of the instructions. The unified cache has a miss penalty of 25 clock cycles and a miss rate of 2%. Assume 32-bit instruction and data addresses.

a. What is the tag size for the cache?

Tag size = Address size (in bits) – Index bits – Block offset bits

Address size = 32 bits

For the index bits, we first calculate the number of sets. The cache is 4-way set associative, so each set holds 4 blocks of 64 bytes, which is 256 bytes per set.

Number of sets = Cache size / (Block size \* Associativity)

= 262144 / (64 \* 4)

= 262144 / 256

= 1024

Index bits = $log_2(1024) = 10$

Block offset bits =  $log_2(block size)$ 

 $= log_2(64)$ 

= 6

Offset bits = 6

Tag size = 32 - 10 - 6

Tag size = 16 bits
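The field widths above can be checked with a short Python sketch. The function name `cache_fields` and its parameters are illustrative choices, not part of the problem statement, and the sketch assumes the cache size, block size, and number of sets are all powers of two.

```python
def cache_fields(cache_bytes, block_bytes, ways, address_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for a set-associative cache.

    Assumes block size and number of sets are powers of two.
    """
    num_sets = cache_bytes // (block_bytes * ways)
    # For a power of two n, n.bit_length() - 1 equals log2(n) exactly.
    offset_bits = block_bytes.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


tag, index, offset = cache_fields(256 * 1024, 64, 4)
print(tag, index, offset)  # 16 10 6
```

Changing `ways` to 1 (direct mapped) would give 4096 sets and a 14-bit tag, which shows why the associativity has to enter the index calculation.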

b. How much faster would the computer be if all memory accesses were cache hits?

Ideal CPI (all hits) = 1.0

In the real machine, 2% of memory accesses miss in the cache, and each miss costs 25 cycles.

Each instruction fetch = 1 access

50% of instructions are loads or stores = 0.5 data accesses per instruction

Thus average accesses per instruction = 1 (fetch) + 0.5 (data) = 1.5

Memory stall cycles per instruction = 1.5 \* 0.02 \* 25

= 0.75

CPI = 1.0 + 0.75

Actual CPI = 1.75

Speedup = (CPI with misses) / (CPI with no misses)

Speedup = (1.75 / 1.0)

Speedup = 1.75, so the computer would be 1.75 times faster if all memory accesses were cache hits.
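As a cross-check of the arithmetic, here is a minimal Python sketch of the same calculation. The helper name `cpi_with_misses` is hypothetical, and the sketch assumes instruction count and clock cycle time are unchanged, so the speedup reduces to a ratio of CPIs.

```python
def cpi_with_misses(base_cpi, data_fraction, miss_rate, miss_penalty):
    """CPI including memory stalls for a unified cache.

    Every instruction makes one fetch access, plus `data_fraction`
    data accesses on average.
    """
    accesses_per_instr = 1.0 + data_fraction
    stall_per_instr = accesses_per_instr * miss_rate * miss_penalty
    return base_cpi + stall_per_instr


ideal_cpi = 1.0
actual_cpi = cpi_with_misses(ideal_cpi, 0.5, 0.02, 25)
speedup = actual_cpi / ideal_cpi
print(round(actual_cpi, 4), round(speedup, 4))  # 1.75 1.75
```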

## **E2**

### Problem

You purchased an Acme computer with the following features:

- 95% of all memory accesses are found in the cache.
- Each cache block is two words, and the whole block is read on any miss.
- The processor sends references to its cache at the rate of $10^9$ words per second.
- 25% of those references are writes.
- Assume that the memory system can support $10^9$ words per second, reads or writes.
- The bus reads or writes a single word at a time (the memory system cannot read or write two words at once).
- Assume at any one time, 30% of the blocks in the cache have been modified.
- The cache uses write allocate on a write miss.

You are considering adding a peripheral to the system, and you want to know how much of the memory system bandwidth is already used. Calculate the percentage of memory system bandwidth used on the average in the two cases below. Be sure to state your assumptions.

a. The cache is write through.

Fraction of read hits = 0.75 \* 0.95 = 0.7125

Fraction of read misses = 0.75 \* 0.05 = 0.0375

Fraction of write hits = 0.25 \* 0.95 = 0.2375

Fraction of write misses = 0.25 \* 0.05 = 0.0125

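For a write-through cache:

On a read hit there is no memory traffic.

On a read miss memory must send two words to the cache.

On a write hit the cache must send one word to memory.

On a write miss, with write allocate, memory must send two words to the cache and the written word is then sent to memory, which is three words in total.

Thus, average words transferred per reference = (0.7125 \* 0) + (0.0375 \* 2) + (0.2375 \* 1) + (0.0125 \* 3)

$$= 0.35$$

Because the processor issues references at the same rate that the memory system can transfer words, about 35% of the memory bandwidth is used.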
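The write-through case can be sketched in Python as below. The function name `write_through_traffic` is illustrative, and the code assumes read hits cause no memory traffic and that a write miss costs a full block fetch plus one word written through. Since the reference rate and the memory bandwidth are equal in this problem, the words transferred per reference is also the fraction of bandwidth used.

```python
def write_through_traffic(hit_rate, write_fraction, block_words):
    """Average words moved over the bus per processor reference."""
    read_fraction = 1.0 - write_fraction
    miss_rate = 1.0 - hit_rate

    read_miss = read_fraction * miss_rate    # fetch the whole block
    write_hit = write_fraction * hit_rate    # write one word through
    write_miss = write_fraction * miss_rate  # fetch block, then write one word

    return (read_miss * block_words
            + write_hit * 1
            + write_miss * (block_words + 1))


used = write_through_traffic(hit_rate=0.95, write_fraction=0.25, block_words=2)
print(round(used, 4))  # 0.35
```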

b. The cache is write back.

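On a read hit or a write hit there is no memory traffic, because a write-back cache updates memory only when a modified line is replaced.

On a read miss:

- If the replaced line is modified, the cache must send two words to memory, and memory must send two words to the cache (four words)
- If the replaced line is not modified, memory must send two words to the cache (two words)

On a write miss (write allocate):

- If the replaced line is modified, the cache sends two words to memory, and memory sends two words to the cache (four words)
- If the replaced line is not modified, memory sends two words to the cache (two words)

Assuming that 30% of replaced lines are modified, average words transferred per reference = (0.0375 + 0.0125) \* ((0.7 \* 2) + (0.3 \* 4))

= 0.05 \* 2.6

$$= 0.13$$

So about 13% of the memory bandwidth is used.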
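A minimal Python sketch of the write-back case follows. The name `write_back_traffic` is hypothetical. The sketch assumes that hits generate no bus traffic, that read and write misses cost the same, and that the replaced line is modified with probability equal to the stated fraction of modified blocks (30%).

```python
def write_back_traffic(hit_rate, dirty_fraction, block_words):
    """Average words moved over the bus per processor reference."""
    miss_rate = 1.0 - hit_rate
    clean_cost = block_words      # fetch the new block only
    dirty_cost = 2 * block_words  # write back the old block, then fetch
    words_per_miss = ((1.0 - dirty_fraction) * clean_cost
                      + dirty_fraction * dirty_cost)
    return miss_rate * words_per_miss


used = write_back_traffic(hit_rate=0.95, dirty_fraction=0.30, block_words=2)
print(round(used, 4))  # 0.13
```

Under these assumptions the write-back cache uses roughly 13% of the memory bandwidth, compared with 35% for write through.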

## **E3**

### Problem

One difference between a write-through cache and a write-back cache can be in the time it takes to write. During the first cycle, we detect whether a hit will occur, and during the second (assuming a hit) we actually write the data. Let's assume that 50% of the blocks are dirty for a write-back cache. For this question, assume that the write buffer for the write through will never stall the CPU (no penalty). Assume a cache read hit takes 1 clock cycle, the cache miss penalty is 50 clock cycles, and a block write from the cache to main memory takes 50 clock cycles. Finally, assume the instruction cache miss rate is 0.5% and the data cache miss rate is 1%. Assuming that on average 26% and 9% of instructions in the workload are loads and stores, respectively, estimate the performance of a write-through cache with a two-cycle write versus a write-back cache with a two-cycle write.

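CPU time = Instruction count \* Number of clock cycles per instruction (CPI) \* Clock cycle time

The instruction count and clock cycle time are the same for both caches, so they can be compared by CPI, where CPI = CPI execution + Stall cycles per instruction.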

Data cache has a miss rate of 1% and instruction cache has a miss rate of 0.5%

26% of instructions are loads and 9% are stores

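i) For write-through cache:

Miss penalty is 50 cycles. The write buffer never stalls the CPU, so a store adds no stall beyond the miss penalty.

Loads and other instructions take 1 cycle and stores take 2 cycles, thus CPI execution = (0.26 \* 1) + (0.09 \* 2) + (0.65 \* 1) = 1.09

Stall cycles per instruction = (Miss rate for instruction cache \* 50) + Miss rate for data cache \* (0.26 \* 50 + 0.09 \* 50) = (0.005 \* 50) + 0.01 \* 17.5 = 0.425

CPI = 1.09 + 0.425

CPI = 1.515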

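ii) For write-back cache:

CPI execution is the same, 1.09, because a write also takes two cycles. On a data cache miss, a clean block costs 50 cycles to fetch, while a dirty block (50% of blocks) must first be written back, costing 50 + 50 = 100 cycles.

Stall cycles per instruction = (Miss rate for instruction cache \* 50) + Miss rate for data cache \* (0.26 + 0.09) \* (0.5 \* 50 + 0.5 \* 100) = 0.25 + 0.2625 = 0.5125

CPI = 1.09 + 0.5125

CPI = 1.6025

Thus, the write-back cache is slower according to the CPI: the write-through cache is about 1.6025 / 1.515 ≈ 1.06 times faster under these assumptions.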
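The two CPI values can be reproduced with the Python sketch below. The function names `cpi_execution` and `stall_cycles` are illustrative. The sketch assumes the instruction cache never holds dirty blocks, that both loads and stores can miss in the data cache, and that a dirty-block replacement adds a 50-cycle write-back on top of the 50-cycle miss penalty.

```python
def cpi_execution(load_frac, store_frac, write_cycles=2):
    """Base CPI: loads and other instructions take 1 cycle, stores take 2."""
    other_frac = 1.0 - load_frac - store_frac
    return load_frac * 1 + store_frac * write_cycles + other_frac * 1


def stall_cycles(i_miss_rate, d_miss_rate, load_frac, store_frac,
                 miss_penalty, dirty_frac=0.0, writeback_cycles=0):
    """Memory stall cycles per instruction."""
    # Average data miss cost: fetch, plus a write-back when the victim is dirty.
    avg_data_penalty = miss_penalty + dirty_frac * writeback_cycles
    instr_stall = i_miss_rate * miss_penalty
    data_stall = d_miss_rate * (load_frac + store_frac) * avg_data_penalty
    return instr_stall + data_stall


base = cpi_execution(0.26, 0.09)
cpi_wt = base + stall_cycles(0.005, 0.01, 0.26, 0.09, 50)
cpi_wb = base + stall_cycles(0.005, 0.01, 0.26, 0.09, 50,
                             dirty_frac=0.5, writeback_cycles=50)
print(round(cpi_wt, 4), round(cpi_wb, 4))  # 1.515 1.6025
```

The lower CPI belongs to the write-through cache, because in this model its write buffer hides the cost of writing to memory while the write-back cache pays for dirty-block replacements.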